Extensive Parameterization And Tuning of Architecture-Sensitive Optimizations
Abstract
The complexity of modern architectures requires compilers to apply an increasingly large collection of architecture-sensitive optimizations, e.g., parallelization and cache optimizations, which interact with each other in unpredictable ways. We present a framework to support fine-grained parameterization of these optimizations and flexible tuning of their configuration space. Instead of directly generating optimized code, we extend an optimizing compiler to output its optimization decisions in POET, a scripting language designed for extensive parameterization of source-to-source program transformations. We then use a transformation-aware (TA) search algorithm to tune the parameterized transformation scripts flexibly and achieve portable high performance. We have used our framework to apply six highly interactive optimizations (parallelization via OpenMP, cache blocking, array copying, unroll-and-jam, scalar replacement, and loop unrolling) and present results of exploring their combined configuration space.
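As a rough illustration of what tuning a combined configuration space involves, the sketch below searches over one parameter per optimization named in the abstract. It is not POET and not the transformation-aware search algorithm: the parameter names, value ranges, and `synthetic_cost` function are all hypothetical stand-ins, and the search shown is a plain coordinate search compared against exhaustive enumeration.

```python
from itertools import product

# Hypothetical configuration space: names and value ranges are illustrative
# assumptions, not taken from the paper or from POET.
SPACE = {
    "omp_threads": [1, 2, 4, 8],
    "block_size": [16, 32, 64, 128],
    "copy_arrays": [False, True],
    "unroll_jam": [1, 2, 4],
    "scalar_repl": [False, True],
    "unroll": [1, 2, 4, 8],
}


def synthetic_cost(cfg):
    """Assumed stand-in for compiling and timing one variant (lower is better)."""
    cost = 100.0 / cfg["omp_threads"]
    cost += abs(cfg["block_size"] - 64) / 16.0
    # Modeled interaction: array copying only pays off with large blocks.
    if cfg["copy_arrays"]:
        cost += -2.0 if cfg["block_size"] >= 64 else 3.0
    cost += abs(cfg["unroll_jam"] - 2)
    cost += 0.0 if cfg["scalar_repl"] else 1.5
    cost += abs(cfg["unroll"] - 4) / 2.0
    return cost


def coordinate_search(space, cost, max_rounds=10):
    """Tune one parameter at a time, repeating until no parameter improves."""
    cfg = {name: values[0] for name, values in space.items()}
    best = cost(cfg)
    for _ in range(max_rounds):
        improved = False
        for name, values in space.items():
            for value in values:
                trial = dict(cfg, **{name: value})
                trial_cost = cost(trial)
                if trial_cost < best:
                    cfg, best, improved = trial, trial_cost, True
        if not improved:
            break
    return cfg, best


def exhaustive_search(space, cost):
    """Evaluate every configuration; feasible only for small spaces."""
    names = list(space)
    best_cfg, best = None, float("inf")
    for combo in product(*(space[n] for n in names)):
        cfg = dict(zip(names, combo))
        c = cost(cfg)
        if c < best:
            best_cfg, best = cfg, c
    return best_cfg, best


if __name__ == "__main__":
    print("coordinate:", coordinate_search(SPACE, synthetic_cost))
    print("exhaustive:", exhaustive_search(SPACE, synthetic_cost))
```

In a real setting the cost function would compile and run a generated code variant, so each evaluation is expensive and the exhaustive loop quickly becomes impractical as parameters are added. That growth is the motivation for searches that exploit knowledge of how the transformations interact.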
Similar resources
ECLAIR: An Efficient Cross Layer Architecture for Wireless Protocol Stacks
Seamless mobility across heterogeneous mobile wireless technologies is now essential as mobile subscribers demand full and cost-effective wireless network coverage. Under such mobility conditions the layered protocol stack is inefficient. Significant research has been done for cross layer optimizations of the protocol stack. To enable rapid deployment of existing and new cross layer optimizatio...
Parameterization and Search-space Exploitation of Loop Fusion
Traditional compilers are limited in their ability to optimize applications for different architectures because statically modeling the effect of specific optimizations on different hardware implementations is difficult. Recent research has been addressing this issue through the use of empirical tuning, which uses trial executions to determine the optimization parameters that are most effective...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Collective Tuning Initiative: automating and accelerating development and optimization of computing systems
Computing systems rarely deliver best possible performance due to ever increasing hardware and software complexity and limitations of the current optimization technology. Additional code and architecture optimizations are often required to improve execution time, size, power consumption, reliability and other important characteristics of computing systems. However, it is often a tedious, repeti...
Designing OP2 for GPU architectures
OP2 is an “active” library framework for the solution of unstructured mesh applications. It aims to decouple the specification of a scientific application from its parallel implementation to achieve code longevity and near-optimal performance through re-targeting the back-end to different multi-core/many-core hardware. This paper presents the design of the current OP2 library for generating eff...